Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 5530 |
| Missing cells | 5933 |
| Missing cells (%) | 7.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 648.2 KiB |
| Average record size in memory | 120.0 B |
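The derived figures in this table can be reproduced from the raw counts. The sketch below is only a sanity check. It assumes that the profiler counts cells as variables × observations and that 1 KiB is 1024 bytes, neither of which the report states explicitly.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 15
n_observations = 5530
missing_cells = 5933
total_size_kib = 648.2

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions the script gives 82950 cells, 7.2% missing, and about 120.0 B per record, which agrees with the table.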
Variable types
| Categorical | 6 |
|---|---|
| Numeric | 9 |
Alerts
| CUST_ID has a high cardinality: 5530 distinct values | High cardinality |
|---|---|
| CASH_ADVANCE has a high cardinality: 2609 distinct values | High cardinality |
| PURCHASES_TRX has a high cardinality: 80 distinct values | High cardinality |
| MINIMUM_PAYMENTS has a high cardinality: 5441 distinct values | High cardinality |
| PURCHASES is highly correlated with ONEOFF_PURCHASES_FREQUENCY | High correlation |
| ONEOFF_PURCHASES_FREQUENCY is highly correlated with PURCHASES | High correlation |
| BALANCE is highly correlated with CASH_ADVANCE_TRX and 1 other fields | High correlation |
| PURCHASES is highly correlated with PURCHASES_FREQUENCY and 1 other fields | High correlation |
| CASH_ADVANCE_TRX is highly correlated with BALANCE and 1 other fields | High correlation |
| PURCHASES_FREQUENCY is highly correlated with PURCHASES and 1 other fields | High correlation |
| ONEOFF_PURCHASES_FREQUENCY is highly correlated with PURCHASES | High correlation |
| CASH_ADVANCE_FREQUENCY is highly correlated with BALANCE and 2 other fields | High correlation |
| PURCHASES is highly correlated with PURCHASES_FREQUENCY | High correlation |
| CASH_ADVANCE_TRX is highly correlated with CASH_ADVANCE_FREQUENCY | High correlation |
| PURCHASES_FREQUENCY is highly correlated with PURCHASES | High correlation |
| CASH_ADVANCE_FREQUENCY is highly correlated with CASH_ADVANCE_TRX | High correlation |
| PURCHASES is highly correlated with PURCHASES_TRX and 1 other fields | High correlation |
| PURCHASES_TRX is highly correlated with PURCHASES and 1 other fields | High correlation |
| CREDIT_LIMIT is highly correlated with BALANCE | High correlation |
| ONEOFF_PURCHASES_FREQUENCY is highly correlated with PURCHASES_TRX | High correlation |
| BALANCE is highly correlated with CREDIT_LIMIT | High correlation |
| PAYMENTS is highly correlated with PURCHASES | High correlation |
| GENDER has 2714 (49.1%) missing values | Missing |
| CASH_ADVANCE_TRX has 150 (2.7%) missing values | Missing |
| ONEOFF_PURCHASES_FREQUENCY has 2740 (49.5%) missing values | Missing |
| CASH_ADVANCE_FREQUENCY has 166 (3.0%) missing values | Missing |
| TENURE has 163 (2.9%) missing values | Missing |
| CUST_ID is uniformly distributed | Uniform |
| CUST_ID has unique values | Unique |
| PAYMENTS has unique values | Unique |
| PURCHASES has 1393 (25.2%) zeros | Zeros |
| CASH_ADVANCE_TRX has 2812 (50.8%) zeros | Zeros |
| PURCHASES_FREQUENCY has 1392 (25.2%) zeros | Zeros |
| ONEOFF_PURCHASES_FREQUENCY has 1464 (26.5%) zeros | Zeros |
| CASH_ADVANCE_FREQUENCY has 2801 (50.7%) zeros | Zeros |
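The five Missing alerts above account for every missing cell in the dataset statistics. A minimal check, using only counts copied from this report (the variable names in the code are illustrative):

```python
n_observations = 5530

# Missing counts copied from the "Missing" alerts.
missing_by_variable = {
    "GENDER": 2714,
    "CASH_ADVANCE_TRX": 150,
    "ONEOFF_PURCHASES_FREQUENCY": 2740,
    "CASH_ADVANCE_FREQUENCY": 166,
    "TENURE": 163,
}

total_missing = sum(missing_by_variable.values())

# Share of observations missing per variable, rounded as in the report.
shares = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_variable.items()
}

print(total_missing)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The total comes to 5933, matching the "Missing cells" figure. Placeholder strings such as `??` in CASH_ADVANCE, MINIMUM_PAYMENTS and TENURE are reported as ordinary values, not as missing ones, so the amount of unusable data is probably higher than this count suggests.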
Reproduction
| Analysis started | 2022-03-06 19:10:41.020455 |
|---|---|
| Analysis finished | 2022-03-06 19:10:53.729491 |
| Duration | 12.71 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
CUST_ID
Categorical
| Distinct | 5530 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.3 KiB |
| C14071 | 1 |
|---|---|
| C14465 | 1 |
| C17245 | 1 |
| C15632 | 1 |
| C13761 | 1 |
| Other values (5525) |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 33180 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 5530 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | C12529 |
|---|---|
| 2nd row | C14138 |
| 3rd row | C15409 |
| 4th row | C18141 |
| 5th row | C15879 |
Common Values
| Value | Count | Frequency (%) |
| C14071 | 1 | < 0.1% |
| C14465 | 1 | < 0.1% |
| C17245 | 1 | < 0.1% |
| C15632 | 1 | < 0.1% |
| C13761 | 1 | < 0.1% |
| C14336 | 1 | < 0.1% |
| C15541 | 1 | < 0.1% |
| C16112 | 1 | < 0.1% |
| C17180 | 1 | < 0.1% |
| C11150 | 1 | < 0.1% |
| Other values (5520) | 5520 |
Length
| Value | Count | Frequency (%) |
| c12941 | 1 | < 0.1% |
| c11435 | 1 | < 0.1% |
| c17341 | 1 | < 0.1% |
| c15178 | 1 | < 0.1% |
| c17814 | 1 | < 0.1% |
| c13638 | 1 | < 0.1% |
| c16928 | 1 | < 0.1% |
| c10794 | 1 | < 0.1% |
| c17777 | 1 | < 0.1% |
| c10800 | 1 | < 0.1% |
| Other values (5520) | 5520 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 7787 | |
| C | 5530 | |
| 8 | 2324 | 7.0% |
| 7 | 2286 | 6.9% |
| 6 | 2276 | 6.9% |
| 4 | 2265 | 6.8% |
| 5 | 2244 | 6.8% |
| 2 | 2231 | 6.7% |
| 3 | 2222 | 6.7% |
| 0 | 2213 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 27650 | |
| Uppercase Letter | 5530 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 7787 | |
| 8 | 2324 | 8.4% |
| 7 | 2286 | 8.3% |
| 6 | 2276 | 8.2% |
| 4 | 2265 | 8.2% |
| 5 | 2244 | 8.1% |
| 2 | 2231 | 8.1% |
| 3 | 2222 | 8.0% |
| 0 | 2213 | 8.0% |
| 9 | 1802 | 6.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 5530 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 27650 | |
| Latin | 5530 | 16.7% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 7787 | |
| 8 | 2324 | 8.4% |
| 7 | 2286 | 8.3% |
| 6 | 2276 | 8.2% |
| 4 | 2265 | 8.2% |
| 5 | 2244 | 8.1% |
| 2 | 2231 | 8.1% |
| 3 | 2222 | 8.0% |
| 0 | 2213 | 8.0% |
| 9 | 1802 | 6.5% |
Latin
| Value | Count | Frequency (%) |
| C | 5530 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 33180 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 7787 | |
| C | 5530 | |
| 8 | 2324 | 7.0% |
| 7 | 2286 | 6.9% |
| 6 | 2276 | 6.9% |
| 4 | 2265 | 6.8% |
| 5 | 2244 | 6.8% |
| 2 | 2231 | 6.7% |
| 3 | 2222 | 6.7% |
| 0 | 2213 | 6.7% |
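The category labels in these tables ("Decimal Number", "Uppercase Letter") look like Unicode general categories. That mapping is an assumption based on the labels, not something the report states. A minimal sketch with Python's standard `unicodedata` module, applied to the five sample identifiers listed above:

```python
import unicodedata
from collections import Counter

# Sample identifiers copied from the "Sample" table.
sample_ids = ["C12529", "C14138", "C15409", "C18141", "C15879"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)
print(category_counts)

# Two-letter codes and the long names used in this report.
LABELS = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

# Characters that appear elsewhere in this report.
for ch in "C1?ñ-.":
    code = unicodedata.category(ch)
    print(repr(ch), code, LABELS[code])
```

Each six-character ID contributes one uppercase letter and five digits, which is consistent with the 5530 and 27650 counts reported for the full column. The standard library does not expose script or block names, so those two breakdowns are not reproduced here.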
GENDER
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 2714 |
| Missing (%) | 49.1% |
| Memory size | 43.3 KiB |
| F | |
|---|---|
| M |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2816 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | F |
|---|---|
| 2nd row | F |
| 3rd row | F |
| 4th row | F |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| F | 1443 | |
| M | 1373 | |
| (Missing) | 2714 |
Length
| Value | Count | Frequency (%) |
| f | 1443 | |
| m | 1373 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 1443 | |
| M | 1373 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2816 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 1443 | |
| M | 1373 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2816 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 1443 | |
| M | 1373 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2816 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| F | 1443 | |
| M | 1373 |
BALANCE
Real number (ℝ)
| Distinct | 5525 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1041.700463 |
| Minimum | -4587.892398 |
|---|---|
| Maximum | 7390.19856 |
| Zeros | 6 |
| Zeros (%) | 0.1% |
| Negative | 165 |
| Negative (%) | 3.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | -4587.892398 |
|---|---|
| 5-th percentile | 3.88240525 |
| Q1 | 74.060304 |
| median | 632.7436345 |
| Q3 | 1545.808455 |
| 95-th percentile | 3869.371332 |
| Maximum | 7390.19856 |
| Range | 11978.09096 |
| Interquartile range (IQR) | 1471.748151 |
Descriptive statistics
| Standard deviation | 1353.093044 |
|---|---|
| Coefficient of variation (CV) | 1.29892718 |
| Kurtosis | 3.290218207 |
| Mean | 1041.700463 |
| Median Absolute Deviation (MAD) | 594.745598 |
| Skewness | 1.475458824 |
| Sum | 5760603.559 |
| Variance | 1830860.785 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 6 | 0.1% |
| 1132.615315 | 1 | < 0.1% |
| 1.155609 | 1 | < 0.1% |
| 911.281862 | 1 | < 0.1% |
| 28.486124 | 1 | < 0.1% |
| 1839.93046 | 1 | < 0.1% |
| 1886.811282 | 1 | < 0.1% |
| 15.728894 | 1 | < 0.1% |
| 1155.338824 | 1 | < 0.1% |
| 25.950664 | 1 | < 0.1% |
| Other values (5515) | 5515 |
| Value | Count | Frequency (%) |
| -4587.892398 | 1 | |
| -4530.639094 | 1 | |
| -4251.411617 | 1 | |
| -4071.993764 | 1 | |
| -3948.776884 | 1 | |
| -3876.778302 | 1 | |
| -3699.694691 | 1 | |
| -3474.972612 | 1 | |
| -3433.295973 | 1 | |
| -3207.605367 | 1 |
| Value | Count | Frequency (%) |
| 7390.19856 | 1 | |
| 7347.355967 | 1 | |
| 7293.108794 | 1 | |
| 7215.745096 | 1 | |
| 7152.864372 | 1 | |
| 7005.310696 | 1 | |
| 6980.228444 | 1 | |
| 6958.239974 | 1 | |
| 6950.583049 | 1 | |
| 6943.433775 | 1 |
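Several rows of the quantile and descriptive tables directly above are functions of the other rows. The sketch below recomputes them from the reported minimum, maximum, quartiles, mean and standard deviation; it is a consistency check on the printed figures, not a description of how the profiler computes them.

```python
import math

# Figures copied from the quantile and descriptive tables.
minimum, maximum = -4587.892398, 7390.19856
q1, q3 = 74.060304, 1545.808455
mean, std = 1041.700463, 1353.093044

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range (IQR)"
cv = std / mean                   # "Coefficient of variation (CV)"
variance = std ** 2               # "Variance"

print(f"range    = {value_range:.5f}")
print(f"IQR      = {iqr:.6f}")
print(f"CV       = {cv:.8f}")
print(f"variance = {variance:.3f}")

# Compare with the reported values, allowing for rounding in the report.
assert math.isclose(value_range, 11978.09096, abs_tol=1e-4)
assert math.isclose(iqr, 1471.748151, abs_tol=1e-6)
assert math.isclose(cv, 1.29892718, abs_tol=1e-6)
assert math.isclose(variance, 1830860.785, abs_tol=1e-2)
```

A coefficient of variation above 1 means the standard deviation exceeds the mean, which fits the right-skewed shape indicated by the positive skewness.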
PURCHASES
Real number (ℝ≥0)
HIGH CORRELATION (×4), ZEROS
| Distinct | 3682 |
|---|---|
| Distinct (%) | 66.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 534.5771031 |
| Minimum | 0 |
|---|---|
| Maximum | 9661.37 |
| Zeros | 1393 |
| Zeros (%) | 25.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 269.13 |
| Q3 | 723.7 |
| 95-th percentile | 1975.906 |
| Maximum | 9661.37 |
| Range | 9661.37 |
| Interquartile range (IQR) | 723.7 |
Descriptive statistics
| Standard deviation | 773.4887449 |
|---|---|
| Coefficient of variation (CV) | 1.446917087 |
| Kurtosis | 18.5878817 |
| Mean | 534.5771031 |
| Median Absolute Deviation (MAD) | 269.13 |
| Skewness | 3.268794177 |
| Sum | 2956211.38 |
| Variance | 598284.8385 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1393 | 25.2% |
| 45.65 | 21 | 0.4% |
| 150 | 14 | 0.3% |
| 60 | 12 | 0.2% |
| 450 | 10 | 0.2% |
| 100 | 10 | 0.2% |
| 50 | 9 | 0.2% |
| 250 | 9 | 0.2% |
| 600 | 9 | 0.2% |
| 120 | 9 | 0.2% |
| Other values (3672) | 4034 |
| Value | Count | Frequency (%) |
| 0 | 1393 | |
| 0.01 | 3 | 0.1% |
| 0.05 | 1 | < 0.1% |
| 0.24 | 1 | < 0.1% |
| 1 | 2 | < 0.1% |
| 4.8 | 1 | < 0.1% |
| 4.99 | 1 | < 0.1% |
| 6.9 | 1 | < 0.1% |
| 7.26 | 1 | < 0.1% |
| 8.4 | 3 | 0.1% |
| Value | Count | Frequency (%) |
| 9661.37 | 1 | |
| 8945.67 | 1 | |
| 8834.96 | 1 | |
| 8591.31 | 1 | |
| 7311.99 | 1 | |
| 6520 | 1 | |
| 6398.73 | 1 | |
| 5855.46 | 1 | |
| 5812.17 | 1 | |
| 5788.81 | 1 |
BALANCE_FREQUENCY
Real number (ℝ≥0)
| Distinct | 58 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.48255227 |
| Minimum | 0 |
|---|---|
| Maximum | 1000 |
| Zeros | 6 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.363636 |
| Q1 | 0.833333 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1000 |
| Range | 1000 |
| Interquartile range (IQR) | 0.166667 |
Descriptive statistics
| Standard deviation | 152.899316 |
|---|---|
| Coefficient of variation (CV) | 5.773586866 |
| Kurtosis | 34.06665053 |
| Mean | 26.48255227 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.96293972 |
| Sum | 146448.5141 |
| Variance | 23378.20083 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 3554 | |
| 0.909091 | 275 | 5.0% |
| 0.818182 | 188 | 3.4% |
| 0.545455 | 158 | 2.9% |
| 0.636364 | 147 | 2.7% |
| 0.727273 | 145 | 2.6% |
| 0.454545 | 135 | 2.4% |
| 0.363636 | 125 | 2.3% |
| 1000 | 111 | 2.0% |
| 0.272727 | 110 | 2.0% |
| Other values (48) | 582 | 10.5% |
| Value | Count | Frequency (%) |
| 0 | 6 | 0.1% |
| 0.090909 | 23 | 0.4% |
| 0.1 | 1 | < 0.1% |
| 0.125 | 2 | < 0.1% |
| 0.142857 | 1 | < 0.1% |
| 0.166667 | 1 | < 0.1% |
| 0.181818 | 89 | |
| 0.2 | 5 | 0.1% |
| 0.222222 | 2 | < 0.1% |
| 0.25 | 4 | 0.1% |
| Value | Count | Frequency (%) |
| 1000 | 111 | |
| 909.091 | 9 | 0.2% |
| 888.889 | 1 | < 0.1% |
| 857.143 | 2 | < 0.1% |
| 833.333 | 1 | < 0.1% |
| 818.182 | 7 | 0.1% |
| 727.273 | 3 | 0.1% |
| 636.364 | 6 | 0.1% |
| 545.455 | 4 | 0.1% |
| 454.545 | 3 | 0.1% |
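The largest values of this variable (1000, 909.091, 818.182 and so on) are 1000 times the fractions that dominate the rest of the distribution (1, 0.909091, 0.818182). One plausible explanation, which is an assumption and not something the report confirms, is that some records were scaled by 1000 during import. If that holds, a repair could look like the sketch below; `rescale_frequency` and its `factor` parameter are hypothetical names.

```python
def rescale_frequency(value, factor=1000.0):
    """Hypothetical repair: divide values above 1 by `factor`.

    Assumes a frequency should lie in [0, 1] and that larger values
    were inflated by exactly `factor`.
    """
    if value > 1:
        return value / factor
    return value


# A few values copied from the tables above.
raw = [1, 0.909091, 1000, 909.091, 818.182, 0]
cleaned = [rescale_frequency(v) for v in raw]

for before, after in zip(raw, cleaned):
    print(f"{before} -> {after}")
```

If the assumption is right, the reported mean of 26.48 and standard deviation of 152.9 mostly reflect the inflated rows and not the underlying frequencies. The median of 1 is unaffected.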
CASH_ADVANCE
Categorical
| Distinct | 2609 |
|---|---|
| Distinct (%) | 47.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.3 KiB |
| 0.0 | |
|---|---|
| ?? | 75 |
| 0.0?ñ | 41 |
| 823.979128 | 1 |
| 181.735735 | 1 |
| Other values (2604) |
Length
| Max length | 13 |
|---|---|
| Median length | 3 |
| Mean length | 6.456057866 |
| Min length | 2 |
Characters and Unicode
| Total characters | 35702 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 2606 |
|---|---|
| Unique (%) | 47.1% |
Sample
| 1st row | 472.818286 |
|---|---|
| 2nd row | 642.862505 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 2183.782456 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 2808 | |
| ?? | 75 | 1.4% |
| 0.0?ñ | 41 | 0.7% |
| 823.979128 | 1 | < 0.1% |
| 181.735735 | 1 | < 0.1% |
| 110.795488 | 1 | < 0.1% |
| 233.246267 | 1 | < 0.1% |
| 148.542419 | 1 | < 0.1% |
| 1116.128466 | 1 | < 0.1% |
| 56.644654 | 1 | < 0.1% |
| Other values (2599) | 2599 |
Length
| Value | Count | Frequency (%) |
| 0.0 | 2808 | |
|  | 75 | 1.4% |
| 0.0?ñ | 41 | 0.7% |
| 181.735735 | 1 | < 0.1% |
| 2327.566908 | 1 | < 0.1% |
| 110.795488 | 1 | < 0.1% |
| 233.246267 | 1 | < 0.1% |
| 148.542419 | 1 | < 0.1% |
| 1116.128466 | 1 | < 0.1% |
| 56.644654 | 1 | < 0.1% |
| Other values (2599) | 2599 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 7537 | |
| . | 5455 | |
| 1 | 3119 | |
| 2 | 2707 | 7.6% |
| 3 | 2485 | 7.0% |
| 4 | 2431 | 6.8% |
| 9 | 2430 | 6.8% |
| 8 | 2375 | 6.7% |
| 5 | 2313 | 6.5% |
| 7 | 2303 | 6.5% |
| Other values (3) | 2547 | 7.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 29929 | |
| Other Punctuation | 5689 | 15.9% |
| Lowercase Letter | 84 | 0.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 7537 | |
| 1 | 3119 | |
| 2 | 2707 | 9.0% |
| 3 | 2485 | 8.3% |
| 4 | 2431 | 8.1% |
| 9 | 2430 | 8.1% |
| 8 | 2375 | 7.9% |
| 5 | 2313 | 7.7% |
| 7 | 2303 | 7.7% |
| 6 | 2229 | 7.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 5455 | |
| ? | 234 | 4.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| ñ | 84 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 35618 | |
| Latin | 84 | 0.2% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 7537 | |
| . | 5455 | |
| 1 | 3119 | |
| 2 | 2707 | 7.6% |
| 3 | 2485 | 7.0% |
| 4 | 2431 | 6.8% |
| 9 | 2430 | 6.8% |
| 8 | 2375 | 6.7% |
| 5 | 2313 | 6.5% |
| 7 | 2303 | 6.5% |
| Other values (2) | 2463 | 6.9% |
Latin
| Value | Count | Frequency (%) |
| ñ | 84 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 35618 | |
| Latin 1 Sup | 84 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 7537 | |
| . | 5455 | |
| 1 | 3119 | |
| 2 | 2707 | 7.6% |
| 3 | 2485 | 7.0% |
| 4 | 2431 | 6.8% |
| 9 | 2430 | 6.8% |
| 8 | 2375 | 6.7% |
| 5 | 2313 | 6.5% |
| 7 | 2303 | 6.5% |
| Other values (2) | 2463 | 6.9% |
Latin 1 Sup
| Value | Count | Frequency (%) |
| ñ | 84 |
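This variable is profiled as categorical even though most of its values are decimal numbers. The likely cause, inferred from the tables above, is the presence of strings such as `??` and `0.0?ñ`, which cannot be parsed as numbers. A minimal cleaning sketch follows. The helper `parse_amount` is hypothetical, and treating `0.0?ñ` as 0.0 and `??` as missing is an assumption that should be checked against the data source.

```python
import re

# A leading, optionally signed decimal number.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_amount(raw):
    """Return the leading number in `raw` as a float, or None if absent.

    Hypothetical helper: trailing debris such as '?ñ' is ignored.
    """
    match = _NUMBER.match(raw.strip())
    return float(match.group()) if match else None


# Values copied from the "Sample" and "Common Values" tables.
samples = ["472.818286", "0.0", "0.0?ñ", "??"]
parsed = [parse_amount(s) for s in samples]

for text, value in zip(samples, parsed):
    print(f"{text!r} -> {value}")
```

With pandas, `pandas.to_numeric(series, errors="coerce")` would also convert unparseable strings to NaN, but it would discard `0.0?ñ` instead of recovering the leading 0.0.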
CASH_ADVANCE_TRX
Real number (ℝ≥0)
| Distinct | 34 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 150 |
| Missing (%) | 2.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.11542751 |
| Minimum | 0 |
|---|---|
| Maximum | 18000 |
| Zeros | 2812 |
| Zeros (%) | 50.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3 |
| 95-th percentile | 12 |
| Maximum | 18000 |
| Range | 18000 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 573.8177709 |
|---|---|
| Coefficient of variation (CV) | 11.68304543 |
| Kurtosis | 469.4166907 |
| Mean | 49.11542751 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 19.33841254 |
| Sum | 264241 |
| Variance | 329266.8342 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2812 | |
| 1 | 562 | 10.2% |
| 2 | 393 | 7.1% |
| 3 | 290 | 5.2% |
| 4 | 234 | 4.2% |
| 5 | 204 | 3.7% |
| 6 | 159 | 2.9% |
| 7 | 130 | 2.4% |
| 8 | 105 | 1.9% |
| 10 | 83 | 1.5% |
| Other values (24) | 408 | 7.4% |
| (Missing) | 150 | 2.7% |
| Value | Count | Frequency (%) |
| 0 | 2812 | |
| 1 | 562 | 10.2% |
| 2 | 393 | 7.1% |
| 3 | 290 | 5.2% |
| 4 | 234 | 4.2% |
| 5 | 204 | 3.7% |
| 6 | 159 | 2.9% |
| 7 | 130 | 2.4% |
| 8 | 105 | 1.9% |
| 9 | 64 | 1.2% |
| Value | Count | Frequency (%) |
| 18000 | 1 | < 0.1% |
| 17000 | 1 | < 0.1% |
| 14000 | 1 | < 0.1% |
| 12000 | 1 | < 0.1% |
| 10000 | 1 | < 0.1% |
| 8000 | 2 | < 0.1% |
| 7000 | 1 | < 0.1% |
| 6000 | 3 | |
| 5000 | 7 | |
| 4000 | 5 |
PURCHASES_FREQUENCY
Real number (ℝ≥0)
| Distinct | 69 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.20600598 |
| Minimum | 0 |
|---|---|
| Maximum | 1000 |
| Zeros | 1392 |
| Zeros (%) | 25.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.363636 |
| Q3 | 0.833333 |
| 95-th percentile | 1 |
| Maximum | 1000 |
| Range | 1000 |
| Interquartile range (IQR) | 0.833333 |
Descriptive statistics
| Standard deviation | 93.75767056 |
|---|---|
| Coefficient of variation (CV) | 7.681273525 |
| Kurtosis | 82.01112325 |
| Mean | 12.20600598 |
| Median Absolute Deviation (MAD) | 0.363636 |
| Skewness | 8.892601479 |
| Sum | 67499.21305 |
| Variance | 8790.500789 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1392 | |
| 1 | 881 | |
| 0.083333 | 465 | 8.4% |
| 0.5 | 277 | 5.0% |
| 0.166667 | 274 | 5.0% |
| 0.25 | 237 | 4.3% |
| 0.333333 | 233 | 4.2% |
| 0.833333 | 230 | 4.2% |
| 0.416667 | 216 | 3.9% |
| 0.666667 | 211 | 3.8% |
| Other values (59) | 1114 |
| Value | Count | Frequency (%) |
| 0 | 1392 | |
| 0.083333 | 465 | 8.4% |
| 0.090909 | 35 | 0.6% |
| 0.1 | 18 | 0.3% |
| 0.111111 | 12 | 0.2% |
| 0.125 | 20 | 0.4% |
| 0.142857 | 17 | 0.3% |
| 0.166667 | 274 | 5.0% |
| 0.181818 | 11 | 0.2% |
| 0.2 | 15 | 0.3% |
| Value | Count | Frequency (%) |
| 1000 | 26 | |
| 916.667 | 6 | 0.1% |
| 900 | 1 | < 0.1% |
| 857.143 | 1 | < 0.1% |
| 833.333 | 4 | 0.1% |
| 818.182 | 1 | < 0.1% |
| 750 | 5 | 0.1% |
| 714.286 | 1 | < 0.1% |
| 700 | 2 | < 0.1% |
| 666.667 | 1 | < 0.1% |
PURCHASES_TRX
Categorical
| Distinct | 80 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.3 KiB |
| 0 | |
|---|---|
| 1 | |
| 12 | |
| 2 | 252 |
| 6 | 243 |
| Other values (75) |
Length
| Max length | 7 |
|---|---|
| Median length | 1 |
| Mean length | 1.45045208 |
| Min length | 1 |
Characters and Unicode
| Total characters | 8021 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 13 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 0 |
| 3rd row | 12 |
| 4th row | 14 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1353 | |
| 1 | 460 | 8.3% |
| 12 | 387 | 7.0% |
| 2 | 252 | 4.6% |
| 6 | 243 | 4.4% |
| 4 | 203 | 3.7% |
| 3 | 196 | 3.5% |
| 5 | 190 | 3.4% |
| 8 | 186 | 3.4% |
| 7 | 184 | 3.3% |
| Other values (70) | 1876 |
Length
| Value | Count | Frequency (%) |
| 0 | 1353 | |
| 1 | 460 | 8.3% |
| 12 | 387 | 7.0% |
| 2 | 252 | 4.6% |
| 6 | 243 | 4.4% |
| 4 | 203 | 3.7% |
| 3 | 196 | 3.5% |
| 5 | 190 | 3.4% |
| 8 | 186 | 3.4% |
| 7 | 184 | 3.3% |
| Other values (70) | 1876 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2014 | |
| 0 | 1989 | |
| 2 | 1285 | |
| 3 | 410 | 5.1% |
| 4 | 408 | 5.1% |
| 6 | 393 | 4.9% |
| 7 | 330 | 4.1% |
| 5 | 328 | 4.1% |
| 8 | 299 | 3.7% |
| 9 | 271 | 3.4% |
| Other values (2) | 294 | 3.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7727 | |
| Other Punctuation | 212 | 2.6% |
| Lowercase Letter | 82 | 1.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2014 | |
| 0 | 1989 | |
| 2 | 1285 | |
| 3 | 410 | 5.3% |
| 4 | 408 | 5.3% |
| 6 | 393 | 5.1% |
| 7 | 330 | 4.3% |
| 5 | 328 | 4.2% |
| 8 | 299 | 3.9% |
| 9 | 271 | 3.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| ? | 212 |
Lowercase Letter
| Value | Count | Frequency (%) |
| ñ | 82 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7939 | |
| Latin | 82 | 1.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2014 | |
| 0 | 1989 | |
| 2 | 1285 | |
| 3 | 410 | 5.2% |
| 4 | 408 | 5.1% |
| 6 | 393 | 5.0% |
| 7 | 330 | 4.2% |
| 5 | 328 | 4.1% |
| 8 | 299 | 3.8% |
| 9 | 271 | 3.4% |
Latin
| Value | Count | Frequency (%) |
| ñ | 82 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7939 | |
| Latin 1 Sup | 82 | 1.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2014 | |
| 0 | 1989 | |
| 2 | 1285 | |
| 3 | 410 | 5.2% |
| 4 | 408 | 5.1% |
| 6 | 393 | 5.0% |
| 7 | 330 | 4.2% |
| 5 | 328 | 4.1% |
| 8 | 299 | 3.8% |
| 9 | 271 | 3.4% |
Latin 1 Sup
| Value | Count | Frequency (%) |
| ñ | 82 |
ONEOFF_PURCHASES_FREQUENCY
Real number (ℝ≥0)
HIGH CORRELATION (×3), MISSING, ZEROS
| Distinct | 41 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 2740 |
| Missing (%) | 49.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1482977523 |
| Minimum | 0 |
|---|---|
| Maximum | 1 |
| Zeros | 1464 |
| Zeros (%) | 26.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.166667 |
| 95-th percentile | 0.75 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.166667 |
Descriptive statistics
| Standard deviation | 0.241687055 |
|---|---|
| Coefficient of variation (CV) | 1.629741862 |
| Kurtosis | 3.442475174 |
| Mean | 0.1482977523 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.013350785 |
| Sum | 413.750729 |
| Variance | 0.05841263257 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1464 | |
| 0.083333 | 376 | 6.8% |
| 0.166667 | 214 | 3.9% |
| 0.25 | 131 | 2.4% |
| 0.333333 | 88 | 1.6% |
| 0.416667 | 83 | 1.5% |
| 1 | 64 | 1.2% |
| 0.5 | 61 | 1.1% |
| 0.583333 | 42 | 0.8% |
| 0.666667 | 38 | 0.7% |
| Other values (31) | 229 | 4.1% |
| (Missing) | 2740 |
| Value | Count | Frequency (%) |
| 0 | 1464 | |
| 0.083333 | 376 | 6.8% |
| 0.090909 | 23 | 0.4% |
| 0.1 | 13 | 0.2% |
| 0.111111 | 11 | 0.2% |
| 0.125 | 11 | 0.2% |
| 0.142857 | 14 | 0.3% |
| 0.166667 | 214 | 3.9% |
| 0.181818 | 14 | 0.3% |
| 0.2 | 12 | 0.2% |
| Value | Count | Frequency (%) |
| 1 | 64 | |
| 0.916667 | 28 | |
| 0.909091 | 1 | < 0.1% |
| 0.875 | 1 | < 0.1% |
| 0.833333 | 21 | 0.4% |
| 0.818182 | 1 | < 0.1% |
| 0.75 | 36 | |
| 0.727273 | 1 | < 0.1% |
| 0.714286 | 2 | < 0.1% |
| 0.7 | 4 | 0.1% |
CASH_ADVANCE_FREQUENCY
Real number (ℝ≥0)
| Distinct | 46 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 166 |
| Missing (%) | 3.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1190054092 |
| Minimum | 0 |
|---|---|
| Maximum | 1.5 |
| Zeros | 2801 |
| Zeros (%) | 50.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.166667 |
| 95-th percentile | 0.5 |
| Maximum | 1.5 |
| Range | 1.5 |
| Interquartile range (IQR) | 0.166667 |
Descriptive statistics
| Standard deviation | 0.1732062886 |
|---|---|
| Coefficient of variation (CV) | 1.455448872 |
| Kurtosis | 3.499384508 |
| Mean | 0.1190054092 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.786846819 |
| Sum | 638.345015 |
| Variance | 0.03000041842 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2801 | |
| 0.083333 | 664 | 12.0% |
| 0.166667 | 466 | 8.4% |
| 0.25 | 360 | 6.5% |
| 0.333333 | 258 | 4.7% |
| 0.416667 | 155 | 2.8% |
| 0.5 | 105 | 1.9% |
| 0.583333 | 75 | 1.4% |
| 0.666667 | 56 | 1.0% |
| 0.090909 | 49 | 0.9% |
| Other values (36) | 375 | 6.8% |
| (Missing) | 166 | 3.0% |
| Value | Count | Frequency (%) |
| 0 | 2801 | |
| 0.083333 | 664 | 12.0% |
| 0.090909 | 49 | 0.9% |
| 0.1 | 28 | 0.5% |
| 0.111111 | 18 | 0.3% |
| 0.125 | 33 | 0.6% |
| 0.142857 | 33 | 0.6% |
| 0.166667 | 466 | 8.4% |
| 0.181818 | 27 | 0.5% |
| 0.2 | 15 | 0.3% |
| Value | Count | Frequency (%) |
| 1.5 | 1 | < 0.1% |
| 1.166667 | 1 | < 0.1% |
| 1 | 4 | |
| 0.916667 | 2 | < 0.1% |
| 0.9 | 1 | < 0.1% |
| 0.875 | 1 | < 0.1% |
| 0.857143 | 4 | |
| 0.833333 | 8 | |
| 0.8 | 3 | 0.1% |
| 0.777778 | 1 | < 0.1% |
CREDIT_LIMIT
Real number (ℝ≥0)
| Distinct | 134 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3588.095256 |
| Minimum | 50 |
|---|---|
| Maximum | 12500 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 50 |
|---|---|
| 5-th percentile | 1000 |
| Q1 | 1500 |
| median | 2900 |
| Q3 | 5000 |
| 95-th percentile | 9000 |
| Maximum | 12500 |
| Range | 12450 |
| Interquartile range (IQR) | 3500 |
Descriptive statistics
| Standard deviation | 2640.396238 |
|---|---|
| Coefficient of variation (CV) | 0.7358768509 |
| Kurtosis | 0.5970263702 |
| Mean | 3588.095256 |
| Median Absolute Deviation (MAD) | 1500 |
| Skewness | 1.145162447 |
| Sum | 19842166.77 |
| Variance | 6971692.293 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3000 | 563 | 10.2% |
| 1500 | 542 | 9.8% |
| 1200 | 457 | 8.3% |
| 1000 | 454 | 8.2% |
| 2500 | 426 | 7.7% |
| 4000 | 317 | 5.7% |
| 6000 | 281 | 5.1% |
| 2000 | 280 | 5.1% |
| 5000 | 225 | 4.1% |
| 7000 | 147 | 2.7% |
| Other values (124) | 1838 |
| Value | Count | Frequency (%) |
| 50 | 1 | < 0.1% |
| 150 | 4 | 0.1% |
| 200 | 3 | 0.1% |
| 300 | 12 | 0.2% |
| 400 | 2 | < 0.1% |
| 450 | 4 | 0.1% |
| 500 | 87 | |
| 600 | 15 | 0.3% |
| 700 | 17 | 0.3% |
| 750 | 3 | 0.1% |
| Value | Count | Frequency (%) |
| 12500 | 12 | 0.2% |
| 12000 | 31 | |
| 11500 | 25 | 0.5% |
| 11000 | 25 | 0.5% |
| 10750 | 1 | < 0.1% |
| 10500 | 41 | |
| 10400 | 1 | < 0.1% |
| 10000 | 69 | |
| 9950 | 1 | < 0.1% |
| 9700 | 1 | < 0.1% |
PAYMENTS
Real number (ℝ≥0)
| Distinct | 5530 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1107.989817 |
| Minimum | 0.056466 |
|---|---|
| Maximum | 9933.62261 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.3 KiB |
Quantile statistics
| Minimum | 0.056466 |
|---|---|
| 5-th percentile | 124.9274707 |
| Q1 | 345.4311015 |
| median | 671.0016995 |
| Q3 | 1354.931507 |
| 95-th percentile | 3710.658747 |
| Maximum | 9933.62261 |
| Range | 9933.566144 |
| Interquartile range (IQR) | 1009.500406 |
Descriptive statistics
| Standard deviation | 1270.892564 |
|---|---|
| Coefficient of variation (CV) | 1.147025491 |
| Kurtosis | 9.951139009 |
| Mean | 1107.989817 |
| Median Absolute Deviation (MAD) | 399.8415645 |
| Skewness | 2.78151989 |
| Sum | 6127183.69 |
| Variance | 1615167.91 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 74.750172 | 1 | < 0.1% |
| 567.149683 | 1 | < 0.1% |
| 660.566074 | 1 | < 0.1% |
| 266.472386 | 1 | < 0.1% |
| 1230.180271 | 1 | < 0.1% |
| 474.469741 | 1 | < 0.1% |
| 1262.938606 | 1 | < 0.1% |
| 728.19268 | 1 | < 0.1% |
| 654.1843 | 1 | < 0.1% |
| 3189.923177 | 1 | < 0.1% |
| Other values (5520) | 5520 |
| Value | Count | Frequency (%) |
| 0.056466 | 1 | |
| 3.500505 | 1 | |
| 4.523555 | 1 | |
| 4.841543 | 1 | |
| 9.533313 | 1 | |
| 12.773144 | 1 | |
| 16.385421 | 1 | |
| 18.125527 | 1 | |
| 18.208604 | 1 | |
| 18.336805 | 1 |
| Value | Count | Frequency (%) |
| 9933.62261 | 1 | |
| 9858.055448 | 1 | |
| 9801.637331 | 1 | |
| 9724.871142 | 1 | |
| 9614.697558 | 1 | |
| 9307.719055 | 1 | |
| 9076.561132 | 1 | |
| 8972.867229 | 1 | |
| 8919.228234 | 1 | |
| 8805.280436 | 1 |
MINIMUM_PAYMENTS
Categorical
| Distinct | 5441 |
|---|---|
| Distinct (%) | 98.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.3 KiB |
| ?? | 89 |
|---|---|
| 299.351881 | 2 |
| 119.325453 | 1 |
| 967.565584 | 1 |
| 175.245981 | 1 |
| Other values (5436) |
Length
| Max length | 13 |
|---|---|
| Median length | 10 |
| Mean length | 9.779385172 |
| Min length | 2 |
Characters and Unicode
| Total characters | 54080 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 5439 |
|---|---|
| Unique (%) | 98.4% |
Sample
| 1st row | 56.999671 |
|---|---|
| 2nd row | 195.162256 |
| 3rd row | 270.413449 |
| 4th row | 194.534934 |
| 5th row | 1129.747227 |
Common Values
| Value | Count | Frequency (%) |
| ?? | 89 | 1.6% |
| 299.351881 | 2 | < 0.1% |
| 119.325453 | 1 | < 0.1% |
| 967.565584 | 1 | < 0.1% |
| 175.245981 | 1 | < 0.1% |
| 1013.780486 | 1 | < 0.1% |
| 373.884808 | 1 | < 0.1% |
| 705.810164 | 1 | < 0.1% |
| 926.087148 | 1 | < 0.1% |
| 189.459157 | 1 | < 0.1% |
| Other values (5431) | 5431 |
Length
| Value | Count | Frequency (%) |
|  | 89 | 1.6% |
| 299.351881 | 2 | < 0.1% |
| 432.876404 | 1 | < 0.1% |
| 431.443631 | 1 | < 0.1% |
| 967.565584 | 1 | < 0.1% |
| 175.245981 | 1 | < 0.1% |
| 1013.780486 | 1 | < 0.1% |
| 373.884808 | 1 | < 0.1% |
| 705.810164 | 1 | < 0.1% |
| 926.087148 | 1 | < 0.1% |
| Other values (5431) | 5431 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 6665 | |
| . | 5441 | |
| 2 | 5081 | |
| 4 | 4849 | |
| 3 | 4794 | |
| 5 | 4701 | |
| 7 | 4670 | |
| 6 | 4669 | |
| 8 | 4598 | |
| 9 | 4539 | |
| Other values (3) | 4073 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 48305 | |
| Other Punctuation | 5697 | 10.5% |
| Lowercase Letter | 78 | 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 6665 | |
| 2 | 5081 | |
| 4 | 4849 | |
| 3 | 4794 | |
| 5 | 4701 | |
| 7 | 4670 | |
| 6 | 4669 | |
| 8 | 4598 | |
| 9 | 4539 | |
| 0 | 3739 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 5441 | |
| ? | 256 | 4.5% |
Lowercase Letter
| Value | Count | Frequency (%) |
| ñ | 78 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 54002 | |
| Latin | 78 | 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 6665 | |
| . | 5441 | |
| 2 | 5081 | |
| 4 | 4849 | |
| 3 | 4794 | |
| 5 | 4701 | |
| 7 | 4670 | |
| 6 | 4669 | |
| 8 | 4598 | |
| 9 | 4539 | |
| Other values (2) | 3995 |
Latin
| Value | Count | Frequency (%) |
| ñ | 78 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 54002 | |
| Latin 1 Sup | 78 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 6665 | |
| . | 5441 | |
| 2 | 5081 | |
| 4 | 4849 | |
| 3 | 4794 | |
| 5 | 4701 | |
| 7 | 4670 | |
| 6 | 4669 | |
| 8 | 4598 | |
| 9 | 4539 | |
| Other values (2) | 3995 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| ñ | 78 |
TENURE
Categorical
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 163 |
| Missing (%) | 2.9% |
| Memory size | 43.3 KiB |
| 12 | |
|---|---|
| 11 | 224 |
| 10 | 149 |
| 6 | 135 |
| 7 | 125 |
| Other values (14) |
Length
| Max length | 4 |
|---|---|
| Median length | 2 |
| Mean length | 1.958263462 |
| Min length | 1 |
Characters and Unicode
| Total characters | 10510 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 8 |
|---|---|
| 2nd row | 12 |
| 3rd row | -12 |
| 4th row | 12 |
| 5th row | 12 |
Common Values
| Value | Count | Frequency (%) |
| 12 | 4226 | |
| 11 | 224 | 4.1% |
| 10 | 149 | 2.7% |
| 6 | 135 | 2.4% |
| 7 | 125 | 2.3% |
| -12 | 124 | 2.2% |
| 8 | 119 | 2.2% |
| 9 | 108 | 2.0% |
| ?? | 69 | 1.2% |
| 12?ñ | 56 | 1.0% |
| Other values (9) | 32 | 0.6% |
| (Missing) | 163 | 2.9% |
Length
| Value | Count | Frequency (%) |
| 12 | 4350 | |
| 11 | 232 | 4.3% |
| 10 | 155 | 2.9% |
| 6 | 135 | 2.5% |
| 7 | 131 | 2.4% |
| 8 | 121 | 2.3% |
| 9 | 110 | 2.0% |
| | 69 | 1.3% |
| 12?ñ | 56 | 1.0% |
| 11?ñ | 3 | 0.1% |
| Other values (3) | 5 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 5033 | |
| 2 | 4406 | |
| ? | 202 | 1.9% |
| 0 | 157 | 1.5% |
| - | 148 | 1.4% |
| 6 | 135 | 1.3% |
| 7 | 131 | 1.2% |
| 8 | 123 | 1.2% |
| 9 | 111 | 1.1% |
| ñ | 64 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 10096 | |
| Other Punctuation | 202 | 1.9% |
| Dash Punctuation | 148 | 1.4% |
| Lowercase Letter | 64 | 0.6% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 5033 | |
| 2 | 4406 | |
| 0 | 157 | 1.6% |
| 6 | 135 | 1.3% |
| 7 | 131 | 1.3% |
| 8 | 123 | 1.2% |
| 9 | 111 | 1.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 148 |
Other Punctuation
| Value | Count | Frequency (%) |
| ? | 202 |
Lowercase Letter
| Value | Count | Frequency (%) |
| ñ | 64 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 10446 | |
| Latin | 64 | 0.6% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 5033 | |
| 2 | 4406 | |
| ? | 202 | 1.9% |
| 0 | 157 | 1.5% |
| - | 148 | 1.4% |
| 6 | 135 | 1.3% |
| 7 | 131 | 1.3% |
| 8 | 123 | 1.2% |
| 9 | 111 | 1.1% |
Latin
| Value | Count | Frequency (%) |
| ñ | 64 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10446 | |
| Latin 1 Sup | 64 | 0.6% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 5033 | |
| 2 | 4406 | |
| ? | 202 | 1.9% |
| 0 | 157 | 1.5% |
| - | 148 | 1.4% |
| 6 | 135 | 1.3% |
| 7 | 131 | 1.3% |
| 8 | 123 | 1.2% |
| 9 | 111 | 1.1% |
Latin 1 Sup
| Value | Count | Frequency (%) |
| ñ | 64 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
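As an illustration of the calculation described above, here is a minimal sketch in plain Python. The function name `pearson_r` and the sample data are hypothetical and are not taken from the profiled dataset; the report itself presumably relies on a library implementation.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_up = pearson_r(x, y)                            # increasing line
r_down = pearson_r(x, [-v for v in y])            # decreasing line
r_shifted = pearson_r(x, [3 * v + 7 for v in y])  # changed location and scale
```

The third call shows the invariance property: rescaling and shifting one variable leaves r unchanged as long as the scale factor is positive.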
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
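The rank-based calculation can be sketched as follows. This is an illustrative example only: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, ties are assumed to receive the average of their positions, and the data is made up.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # monotonic relationship, so rho reaches its maximum
r = pearson_r(x, y)       # the linear measure stays below 1
```

On this cubic relationship ρ is 1 while r is noticeably lower, which is the sense in which ρ catches nonlinear monotonic correlations better.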
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
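The pair-counting definition above can be written directly as a short sketch. It implements the formula exactly as stated (often called τ-a); library implementations commonly apply a tie-adjusted variant instead, so results on tied data may differ. The function name `kendall_tau_a` and the data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)                   # 5 concordant, 1 discordant
tau_reversed = kendall_tau_a(x, [4, 3, 2, 1])
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = (5 - 1) / 6.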
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | CUST_ID | GENDER | BALANCE | PURCHASES | BALANCE_FREQUENCY | CASH_ADVANCE | CASH_ADVANCE_TRX | PURCHASES_FREQUENCY | PURCHASES_TRX | ONEOFF_PURCHASES_FREQUENCY | CASH_ADVANCE_FREQUENCY | CREDIT_LIMIT | PAYMENTS | MINIMUM_PAYMENTS | TENURE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C12529 | F | 107.944741 | 118.16 | 0.875000 | 472.818286 | 1.0 | 0.125000 | 2 | 0.125 | 0.125000 | 2500.0 | 192.781455 | 56.999671 | 8 |
| 1 | C14138 | NaN | 241.032979 | 0.00 | 1.000000 | 642.862505 | 1.0 | 0.000000 | 0 | NaN | 0.083333 | 1500.0 | 915.454305 | 195.162256 | 12 |
| 2 | C15409 | NaN | 894.357857 | 1164.00 | 1.000000 | 0.0 | 0.0 | 1.000000 | 12 | NaN | 0.000000 | 2000.0 | 907.603723 | 270.413449 | -12 |
| 3 | C18141 | F | -188.132508 | 515.88 | 1.000000 | 0.0 | NaN | 0.833333 | 14 | NaN | 0.000000 | 2700.0 | 601.729266 | 194.534934 | 12 |
| 4 | C15879 | NaN | 3881.679582 | 15.92 | 1.000000 | 2183.782456 | 9.0 | 0.083333 | 1 | NaN | 0.333333 | 5500.0 | 1032.183632 | 1129.747227 | 12 |
| 5 | C17660 | NaN | 1087.784698 | 0.00 | 1.000000 | 1562.703953 | 2.0 | 0.000000 | 0 | 0.000 | 0.166667 | 1500.0 | 3093.888643 | 298.011965 | 12 |
| 6 | C10916 | NaN | 1081.065726 | 554.85 | 1.000000 | 952.424906 | 8.0 | 0.500000 | 20 | 0.250 | 0.166667 | 2100.0 | 1898.828120 | 382.716751 | 12 |
| 7 | C15128 | NaN | 100.208311 | 0.00 | 0.909091 | 182.143966 | 1.0 | 0.000000 | 0 | NaN | 0.090909 | 3000.0 | 175.911508 | 145.244181 | 11 |
| 8 | C10109 | NaN | 862.072380 | 0.00 | 1.000000 | 920.309805 | 1.0 | 0.000000 | 0 | 0.000 | 0.083333 | 4000.0 | 2236.890255 | 214.828158 | 12 |
| 9 | C17983 | NaN | 1757.439933 | 0.00 | 0.833333 | 2408.007601 | 6.0 | 0.000000 | 0 | 0.000 | 0.166667 | 2500.0 | 175.115831 | 450.616731 | 6 |
Last rows
| | CUST_ID | GENDER | BALANCE | PURCHASES | BALANCE_FREQUENCY | CASH_ADVANCE | CASH_ADVANCE_TRX | PURCHASES_FREQUENCY | PURCHASES_TRX | ONEOFF_PURCHASES_FREQUENCY | CASH_ADVANCE_FREQUENCY | CREDIT_LIMIT | PAYMENTS | MINIMUM_PAYMENTS | TENURE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5520 | C16104 | NaN | 2525.683344 | 0.00 | 1.000000 | 285.193204 | 5.0 | 0.000000 | 0 | NaN | 0.166667 | 7000.0 | 1483.384610 | 702.052491 | 12 |
| 5521 | C19019 | NaN | 634.514354 | 0.00 | 0.909091 | 1682.137421 | 12.0 | 0.000000 | 0 | 0.000000 | 0.636364 | 1500.0 | 2162.277429 | 257.081648 | 11 |
| 5522 | C18355 | NaN | 930.656420 | 300.05 | 1.000000 | 0.0 | 0.0 | 0.750000 | 9 | NaN | 0.000000 | 1200.0 | 513.064156 | 330.422815 | 12 |
| 5523 | C18766 | NaN | 21.168201 | 236.40 | 1.000000 | 0.0 | 0.0 | 1.000000 | 24 | 1.000000 | NaN | 2500.0 | 217.008342 | 178.169321 | 12 |
| 5524 | C16616 | NaN | 846.091011 | 2599.20 | 1.000000 | 0.0 | 0.0 | 0.916667 | 19 | 0.333333 | 0.000000 | 3000.0 | 1900.699307 | 195.516066 | 12 |
| 5525 | C10075 | NaN | 656.013010 | 0.00 | 1000.000000 | 1474.349901 | 3.0 | 0.000000 | 0 | 0.000000 | 0.125000 | 7000.0 | 910.457985 | 140.983193 | 8 |
| 5526 | C17321 | NaN | 15.232505 | 384.00 | 0.272727 | 0.0 | 0.0 | 1.000000 | 12?ñ | NaN | 0.000000 | 1500.0 | 568.982664 | 54.449416 | 12 |
| 5527 | C12909 | NaN | 1023.124791 | 1537.93 | 1.000000 | 247.04197 | 1.0 | 0.750000 | 25 | 0.583333 | 0.083333 | 9000.0 | 1070.149971 | 235.241959 | -12 |
| 5528 | C15615 | F | 957.010021 | 604.80 | 1.000000 | 901.754709 | 3.0 | 1.000000 | 12 | NaN | 0.083333 | 1000.0 | 811.457190 | 926.087148 | 12 |
| 5529 | C12391 | NaN | 2664.700424 | 715.51 | 1.000000 | 494.573662 | 1.0 | 750.000000 | 11 | 0.083333 | 0.083333 | 3500.0 | 918.003032 | 792.902894 | 12 |